Statistical Learning of Word Acquisition with Application to Readability Prediction
Abstract
Language learning, as expressed through word-acquisition and readability, plays an important role in both psycholinguistic theories and information system engineering. We present a statistical model for document readability that is based on the logistic Rasch model and the quantiles of word acquisition age distributions. We exploit this connection to infer the distributions of acquisition ages from empirical readability data that was automatically collected from the web. Contrasting the inferred acquisition distributions with existing oral studies reveals interesting historical trends as well as differences between the oral and written word acquisition processes. We also demonstrate how the inferred acquisition distributions can be used to predict global and local document readability.
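As a rough illustration of the two ingredients named in the abstract, the sketch below writes out the logistic Rasch form and the quantile of a logistic acquisition-age distribution. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the `location`/`scale` parameterization, and the example numbers are all hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Logistic Rasch form: probability of success given ability minus difficulty."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def acquisition_cdf(age: float, location: float, scale: float) -> float:
    """Fraction of readers assumed to know a word by `age`.

    Assumption: acquisition age follows a logistic distribution with the
    given location (median acquisition age) and scale.
    """
    return rasch_probability(age / scale, location / scale)


def acquisition_quantile(q: float, location: float, scale: float) -> float:
    """Age by which a fraction `q` of readers is assumed to know the word."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    # Inverse of the logistic CDF above.
    return location + scale * math.log(q / (1.0 - q))


# Hypothetical word: median acquisition age 8 years, scale 1.5 years.
age_for_80_percent = acquisition_quantile(0.8, location=8.0, scale=1.5)
share_at_that_age = acquisition_cdf(age_for_80_percent, location=8.0, scale=1.5)
```

Under these assumptions the quantile function is the exact inverse of the CDF, so `share_at_that_age` recovers the requested fraction of 0.8 up to floating-point error.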
Similar resources
Statistical Estimation of Word Acquisition with Application to Readability Prediction
Models of language learning play a central role in a wide range of applications: from psycholinguistic theories of how people acquire new word knowledge, to information systems that can automatically match content to users’ reading ability. We present a novel statistical approach that can infer the distribution of a word’s likely acquisition age automatically from authentic texts collected from...
EFL Textbook Evaluation: An Analysis of Readability and Vocabulary Profiler of Four Corners Book Series
This study investigated whether there is a significant relationship between readability and vocabulary profile, including the most frequent words (K1 words) and the academic word list (AWL), in the reading passages of the Four Corners series of EFL textbooks. To determine the readability of the texts, the Flesch–Kincaid (1975) readability test was used, while the texts' academic word li...
Written word recognition by the elementary and advanced level Persian-English bilinguals
According to a basic prediction made by the Revised Hierarchical Model (RHM), at early stages of language acquisition, strong L2-L1 lexical links are formed. RHM predicts that these links weaken with increasing proficiency, although they do not disappear even at higher levels of language development. To test this prediction, two groups of highly proficie...
Application of Machine Learning Algorithms for Automatic Knowledge Acquisition and Readability Analysis Technical Report
A large knowledge base is a prerequisite for a lot of tasks in natural language processing (NLP). To build a handcrafted knowledge base, which is applicable to real world scenarios, a vast amount of effort is required. Furthermore, experts are needed with a strong background in linguistics, artificial intelligence and knowledge representation which may not be available to the extent necessary (...